Notes on MapReduce Algorithms
Author
Abstract
1. Set $k = N^{c/2}$.
2. Partition $V$ into $k$ parts of equal size, $V_1, V_2, \ldots, V_k$, with $V_i \cap V_j = \emptyset$ for $i \neq j$ and $|V_i| = N/k$ for all $i$.
3. Let $E_{i,j} \subseteq E$ be the set of edges induced by the vertex set $V_i \cup V_j$, that is, $E_{i,j} = \{(u, v) \in E \mid u, v \in V_i \cup V_j\}$.
4. For each pair $(i, j)$, send the subgraph $G_{i,j} = (V_i \cup V_j, E_{i,j})$ to its own server and compute its minimum spanning tree $M_{i,j}$.
5. Send $H = \bigcup_{i,j} M_{i,j}$ to a single server and compute the final MST $M$ of $H$.
6. Return $M$.
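The procedure above can be sketched on a single machine as follows. This is a minimal illustration, not the author's implementation. The function names `kruskal` and `two_round_mst` are hypothetical, as is the choice of Kruskal's algorithm as the local MST routine. The "servers" are simulated by a plain loop over pairs $(i, j)$, and $k$ is passed in directly rather than derived from $N$ and $c$. Parts are only nearly equal in size when $k$ does not divide $N$.

```python
import itertools


def kruskal(vertices, edges):
    """Minimum spanning forest of (vertices, edges); edges are (weight, u, v) tuples."""
    parent = {v: v for v in vertices}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so keep it
            parent[ru] = rv
            forest.append((w, u, v))
    return forest


def two_round_mst(vertices, edges, k):
    """Simulate the two-round scheme: local MSTs per pair of parts, then a final MST."""
    vertices = list(vertices)
    if k < 2:
        # With a single part there are no pairs; solve directly (assumption of this sketch).
        return kruskal(vertices, edges)

    # Partition V into k disjoint parts of (nearly) equal size.
    parts = [vertices[i::k] for i in range(k)]

    # Round 1: each pair (i, j) plays the role of one server.
    h = set()
    for i, j in itertools.combinations(range(k), 2):
        v_ij = set(parts[i]) | set(parts[j])
        # E_ij: edges with both endpoints inside V_i ∪ V_j.
        e_ij = [(w, u, v) for (w, u, v) in edges if u in v_ij and v in v_ij]
        h.update(kruskal(v_ij, e_ij))  # M_ij

    # Round 2: a single server computes the MST of H, the union of all M_ij.
    return kruskal(vertices, h)
```

Calling `two_round_mst(range(6), edges, 3)` on a list of `(weight, u, v)` tuples returns the edges of the final tree. The result should match a direct `kruskal` call on the whole graph, because an edge discarded in round 1 is the heaviest edge on some cycle of its subgraph and so cannot belong to the MST of the full graph.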